ci(skills-eval): wire schedule to MANUAL_FULL_SWEEP + sync dispatch picker by richa-nvidia · Pull Request #701 · NVIDIA-AI-Blueprints/rag

richa-nvidia · 2026-06-29T09:37:02Z

Summary

Two independent fixes in .github/workflows/skills-eval.yml:

Schedule trigger now exports MANUAL_FULL_SWEEP=1 (same as workflow_dispatch already does). Before this, nightly runs crashed with FATAL: PR_NUMBER not set in environment — see GHA run 28214488174.
workflow_dispatch picker now lists the 3 skills that actually exist on disk (rag-blueprint, rag-eval, rag-perf) instead of 10 phantom names left over from an old taxonomy. The LLM already enumerates skills/*/eval/*.json itself, so the stale picker was a UX trap — anyone picking an old name got a silent BLOCKED. Closes the gotcha documented at ONBOARDING.md:487-492.

Local verification (all green)

YAML + workflow lint: parses clean, picker has exactly 4 options (* + 3 real skills).
Bash conditional simulation: confirmed event_name=schedule now sets MANUAL_FULL_SWEEP=1, MANUAL_SKILLS_FILTER=*; event_name=push still leaves them unset.
Python agent dry-run: reproduced the old FATAL: PR_NUMBER without the fix; with the fix the agent gets past the _require gate.

Why this is independent of #680

PR #680 fixed Brev auth + cleanup. Both passed cleanly on run 28214488174 (no GPU leak). This PR fixes the next failure exposed once cleanup stopped silently swallowing errors.

Test plan

Trigger workflow_dispatch on this branch — confirm picker only shows *, rag-blueprint, rag-eval, rag-perf, and the agent reaches [agent] starting instead of FATAL: PR_NUMBER.
First nightly after merge (2026-06-27 02:00 UTC) — confirm no FATAL: PR_NUMBER in logs.
After 7 days — audit brev ls for any leaked rag-eval-gpu-* instances (still expected: zero).

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Scheduled runs now use the same skill-selection behavior as manual runs, defaulting to all skills when no selection is provided.
- The skill selector list was simplified to the currently supported options while still allowing all skills to be selected.

…icker Two independent fixes in one file: 1. Schedule trigger now exports MANUAL_FULL_SWEEP=1 (like workflow_dispatch already does). Before this, the schedule event fell through to the PR-mode code path in skills_eval_agent.py and died with "FATAL: PR_NUMBER not set in environment". Also default MANUAL_SKILLS_FILTER to "*" when INPUT_SKILLS is empty (schedule events don't populate `inputs`). 2. workflow_dispatch picker now lists the 3 skills that actually exist on disk (rag-blueprint, rag-eval, rag-perf) instead of 10 phantom names (rag-deploy-blueprint, rag-ingest-documents, etc.) that were left over from an old skill taxonomy. The LLM agent already enumerates skills/*/eval/*.json itself, so the stale picker was a UX trap only — anyone picking one of the old names got a silent BLOCKED result. ONBOARDING.md:487-492 documents this as a known gotcha; this PR closes the gotcha. Discovered while diagnosing GHA run 28214488174 (first nightly after PR #680). The Brev auth step and cleanup step from #680 both passed cleanly — this is a separate, pre-existing bug surfaced by them running on schedule for the first time. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: richa <ricsingh@nvidia.com>

copy-pr-bot · 2026-06-29T09:37:07Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-29T09:37:21Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: b25f418a-3ffe-4e87-a746-815650476b56

📥 Commits

Reviewing files that changed from the base of the PR and between 3241f06 and 7208159.

📒 Files selected for processing (1)

.github/workflows/skills-eval.yml

📝 Walkthrough

Walkthrough

The skills-eval workflow trims the workflow_dispatch skill choices to rag-blueprint, rag-eval, and rag-perf. The agent run script's event gating is extended to cover both workflow_dispatch and schedule, setting MANUAL_SKILLS_FILTER from INPUT_SKILLS with a fallback to * for schedule events that carry no inputs.

Changes

Skills-eval Workflow

Layer / File(s)	Summary
Skill choices and filter unification `.github/workflows/skills-eval.yml`	Removes extra `rag-` options from the `skills` dispatch input, keeping only `rag-blueprint`, `rag-eval`, and `rag-perf`; extends the `MANUAL_SKILLS_FILTER` assignment to `schedule` events with `INPUT_SKILLS:-` fallback.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐇 Hop hop, fewer choices to chase,
Three rag skills now run the race.
Schedule joins the dispatch crew,
Filter stars when inputs are few.
Clean and neat, the workflow's ace! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the two main workflow fixes: scheduling now uses MANUAL_FULL_SWEEP and the dispatch picker was synced.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch ci/skills-eval-schedule-fix

_{Comment @coderabbitai help to get the list of available commands.}

faaranm requested a review from shubhadeepd June 29, 2026 16:02

shubhadeepd approved these changes Jun 30, 2026

View reviewed changes

shubhadeepd merged commit 34cf9c8 into main Jun 30, 2026
10 of 11 checks passed

shubhadeepd deleted the ci/skills-eval-schedule-fix branch June 30, 2026 05:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ci(skills-eval): wire schedule to MANUAL_FULL_SWEEP + sync dispatch picker#701

ci(skills-eval): wire schedule to MANUAL_FULL_SWEEP + sync dispatch picker#701
shubhadeepd merged 1 commit into
mainfrom
ci/skills-eval-schedule-fix

richa-nvidia commented Jun 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

copy-pr-bot Bot commented Jun 29, 2026

Uh oh!

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

richa-nvidia commented Jun 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Local verification (all green)

Why this is independent of #680

Test plan

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented Jun 29, 2026

Uh oh!

coderabbitai Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

richa-nvidia commented Jun 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading